Manufactured sand has high potential for replacing natural
sand and reducing the negative impact of the construction
industry on the environment. This paper aims to develop a
novel deep learning-based approach for estimating the
compressive strength of manufactured-sand concrete. The
deep neural networks are trained by the advanced optimizers
of Root Mean Squared Propagation (RMSprop), Adaptive Moment
Estimation (Adam), and Adaptive Moment Estimation with Nesterov
momentum (Nadam). In addition, the activation functions of
logistic sigmoid, hyperbolic tangent sigmoid, and rectified
linear unit activation are employed. A dataset including 132
samples has been used to train and verify the deep neural
networks. Stone powder content, sand ratio, quantity of
cement, quantity of water, quantity of coarse aggregate,
quantity of water-reducer, quantity of manufactured sand,
concrete slump, unit weight of concrete, and curing age are
utilized as predictor variables. Based on experiments, the
Nadam-optimized model used with the sigmoid activation
function has achieved the best performance with
root mean square error (RMSE) = 1.95, mean absolute
percentage error (MAPE) = 3.04%, and coefficient of
determination (R²) = 0.97. Thus, this neural computing
model is recommended for practical purposes because it can
help to reduce the time and cost dedicated to laboratory
work.


1. Introduction


Concrete has been extensively employed in the construction industry because it features many 
advantageous engineering properties. When combined with steel reinforcement, reinforced 
concrete achieves high strength and durability. In addition, concrete material offers good 
resistance to water and high temperature. Usual concrete mixtures include binding material (e.g. 
Portland cement), coarse aggregate, fine aggregate, and water. The components of a concrete
mixture are typically inexpensive and readily accessible. The aforementioned features of
concrete make this construction material highly suitable for a wide range of civil and 
infrastructure projects [1,2]. 


In Vietnam, as in other countries around the globe, demand for sand is rising rapidly
due to fast-paced infrastructure development. Consequently, natural sources of sand barely satisfy
domestic demand, and sand shortages have become apparent [3,4]. Accordingly, researchers and
practicing engineers have resorted to using manufactured sand made from crushed rocks (e.g. 
granite, basalt, and other sand stones) as an alternative to natural sand [5,6]. 


Since concrete using manufactured sand has high potential for solving the issue of sand shortage
and mitigating the effect of the construction industry on the natural environment, various studies
have been dedicated to investigating this material’s mechanical properties [7,8]. In concrete
design, compressive strength (CS) is widely regarded as the most crucial index [9-11]. Other
properties such as elastic modulus and water tightness can be inferred via their correlations with
the CS [12]. Estimating the CS of a concrete mixture containing manufactured sand based on its
components is particularly important for mixture design: if this parameter is
correctly predicted, the time and cost dedicated to laboratory work can be reduced or even avoided
[13,14].


Nevertheless, estimation of CS is a challenging task. The reason is that this mechanical property 
is dependent on various factors such as mix proportions and concrete age. Concrete is a highly 
nonhomogeneous material with a diverse set of constituents. Furthermore, the mapping function 
between the CS of a concrete mix and its components has been generally demonstrated to be 
complex and nonlinear [15]. Therefore, conventional regression analysis and equation-based 
models used for estimating the CS of concrete mixes often fall short of the industry's 
requirements [16-19].


In recent years, with advances in machine learning (ML) and computing power,
researchers have increasingly relied on intelligent data-driven approaches for predicting the CS of
concrete mixes based on their constituents and age [20,21]. ML-based models have demonstrated
promising capabilities in capturing the nonlinear and multivariate relationships between concrete
strength and its influencing factors. State-of-the-art regression approaches such as
artificial neural networks, fuzzy neural networks, deep neural computing, boosting machines,
and ensembles of decision trees can not only learn these functional relationships with a high
degree of precision in the learning phase but also perform well on unseen data
in the testing phase [22-31].


An artificial neural network (ANN) has been used in [32] to predict the CS of
environmentally friendly concrete. ANN and the adaptive neuro-fuzzy inference system (ANFIS)
have been used in [33] to construct models for predicting the CS of regular and
high-performance concretes. The authors compare different training schemes including the Grey Wolf
Optimizer metaheuristic and the Levenberg-Marquardt (LM) algorithm. It is experimentally
found that the ANN model trained with the LM algorithm achieves the best outcome.
Czarnecki et al. [20] present an integration of the self-organizing feature map and ANN to
predict the CS of cementitious composites with ground granulated blast furnace slag.


Shahmansouri et al. [34] and Moradi et al. [35] both demonstrate the potential of ANN in
modeling the CS of concrete mixes. The former work shows that ANN can achieve better predictive
performance than gene expression programming. The latter once again
confirms the finding of [33], which shows good outcomes obtained from the LM-based ANN
model. Nevertheless, one notable disadvantage of the LM algorithm is that it requires the
computation and storage of Jacobian matrices. These matrices often become enormously
large for big datasets and deep neural networks that involve multiple hidden layers. In addition,
limitations of the conventional shallow backpropagation ANN in modeling complex engineering
processes were also pointed out in [28-36].


Golafshani and Behnood [37] propose a novel integration of ANN and the multi-verse optimizer for
predicting mechanical properties of sustainable concrete containing waste foundry sand. The
capability of neural networks to model complex estimation tasks in civil engineering was
demonstrated in [38]. Faraj et al. [39] construct a data-driven approach for inferring the CS of
eco-friendly self-compacting concrete incorporating ground granulated blast furnace slag; the ANN
has been used as the function approximator and has achieved a good coefficient of determination
of R² = 0.955. Rezazadeh et al. [40] recently demonstrated the superiority of ANN over the
Genetic Programming and the Combinatorial Group Method of Data Handling approaches.


Ahmed et al. [41] investigate the capability of an ANN model and an M5P-tree for predicting the
CS of geopolymer concrete incorporating nano-silica. Pan et al. [42] successfully integrate the
genetic algorithm (GA) and ANN to establish a hybrid intelligent model for estimating the CS of
green concrete. Zhang et al. [43] develop a model for predicting the mechanical properties of
manufactured-sand concrete using tree-based models. These tree-based models include
regression tree, random forest, and gradient boosted regression tree. In addition, the Firefly
algorithm (FA) has been integrated with the tree-based models to optimize their model selection
phases. Although the hybrid GA-ANN and FA-tree models demonstrate good predictive
outcomes, the main concern with these frameworks is the high computational cost required
for training or optimizing the prediction models. This is because both GA and FA are
population-based metaheuristics; a large number of function evaluations is therefore required to adapt the
ML-based models.


Recently, deep artificial neural network regression (DANNR) has gained increasing attention
from researchers in the field of modeling concrete strength. The basic idea of a deep ANN is to
create a network with a deep hierarchical organization of hidden layers. Each layer can distill and
generalize data from the previous layers into more informative signals that are transferred to the
subsequent layers. Each hidden layer plays the role of a feature learning or engineering operator.
In a normal shallow ANN, there is only one hidden layer that extracts and learns features from
the input data. Meanwhile, in a DANNR, various feature learners can be stacked to generate
increasingly informative signals used for function approximation. With such advantages,
DANNR-based models are capable of coping with nonlinear and multivariate datasets [44].


Accordingly, ML-based models for estimating the CS of concrete incorporating waste marble
powder have been recently put forward in [45]. The authors rely on a neural computing model
with 3 hidden layers, and this deep ANN model is trained by Adaptive Moment Estimation
(Adam). This deep ANN model demonstrates a competitive performance compared to the
Extreme Gradient Boosting Machine (XGBoost). However, this study has not explored the
potential of other state-of-the-art optimizers (e.g. Root Mean Squared Propagation or Adaptive
Moment Estimation with Nesterov momentum) for training the deep ANN model. Haque et al.
[46] rely on a DANNR for estimating the strength of fly ash-based magnesium phosphate
cement mortar. This study shows the good performance of a deep ANN with two hidden layers
and the hyperbolic tangent sigmoid activation function. Nevertheless, the effectiveness of other
activation functions has not been explored in that paper. Asghari et al. [44] demonstrate the superiority
of DANNR-based models in predicting the undrained shear strength of clays; the DANNR-based
models have outperformed conventional regression and equation-based approaches.


According to the existing works, an increasing trend of utilizing sophisticated ML models and
deep neural networks for predicting the CS of concrete can be observed. However, although deep
learning has been used to estimate the CS of concrete, few studies have compared the performance
of different advanced gradient-descent-based optimizers for training DANNR models. Motivated by
this observation, this study compares the performances of DANNR models using different advanced
optimizers in estimating the CS of concrete containing manufactured sand. The Adam, Root Mean
Squared Propagation (RMSprop), and Adaptive Moment Estimation with Nesterov momentum (Nadam)
optimizers are employed. The current work is therefore an attempt to fill this gap in the literature.


The remainder of the paper is organized as follows. The next section summarizes the
research method, which covers the DANNR, the optimizers used, and the employed dataset of
concrete containing manufactured sand. The third section presents the findings of the current work.
The conclusion is provided in the final section.


2. Research method 


2.1. Deep artificial neural network regression (DANNR) 


A deep neural network model generally comprises an input layer, a set of hidden layers, and an
output layer [47]. The input layer is essentially an external signal receiver, and the output layer
simply processes the results of the last hidden layer and yields the predicted dependent variable
(e.g. CS). In deep neural networks, there are multiple hidden layers containing neurons for
processing a dataset and generalizing a mapping function between the input signal x (e.g.
concrete constituents and age) and the output y (e.g. CS). Generally, the stacked hidden layers
enhance the network’s robustness in generalizing nonlinear mapping functions; the suitable
numbers of hidden layers and neurons in each hidden layer are data-dependent and should be
determined experimentally [48]. A typical DANNR structure is depicted in Fig. 1. Herein, there
are D input variables ($x_1, x_2, \ldots, x_D$) that represent the characteristics of a concrete mix.


[Figure: input layer, hidden layers, and output layer yielding the predicted concrete strength]

Fig. 1. A DANNR structure.


Table 1
Activation functions used in a DANNR.

Activation function                Formula                                               Derivative
Logistic sigmoid                   $f(x) = \frac{1}{1+\exp(-x)}$                         $f'(x) = f(x)\,(1 - f(x))$
Hyperbolic tangent sigmoid         $f(x) = \frac{\exp(x)-\exp(-x)}{\exp(x)+\exp(-x)}$    $f'(x) = 1 - f(x)^2$
Rectified linear unit activation   $f(x) = x$ if $x > 0$; $0$ if $x \le 0$               $f'(x) = 1$ if $x > 0$; $0$ if $x \le 0$
Linear                             $f(x) = x$                                            $f'(x) = 1$


In Fig. 1, $f_A$ denotes an activation function. In the input and hidden layers, nonlinear activation
functions are often used to learn a nonlinear target function. The commonly used activation 
functions may include logistic sigmoid, hyperbolic tangent sigmoid, and rectified linear unit 
activation (ReLU) [47-49]. For a DANNR that performs function approximation tasks, the 
output layer simply employs a linear activation function. The used activation functions and their 
derivatives are summarized in Table 1. 
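

The entries of Table 1 can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the MATLAB code used in the study; the function names are hypothetical, and the ReLU derivative at x = 0 is set to 0 by convention.

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid: f(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def sigmoid_prime(x):
    # f'(x) = f(x) * (1 - f(x))
    s = sigmoid(x)
    return s * (1.0 - s)

def tanh(x):
    # Hyperbolic tangent sigmoid
    return np.tanh(np.asarray(x, dtype=float))

def tanh_prime(x):
    # f'(x) = 1 - f(x)^2
    return 1.0 - tanh(x) ** 2

def relu(x):
    # Rectified linear unit: x for x > 0, otherwise 0
    return np.maximum(0.0, np.asarray(x, dtype=float))

def relu_prime(x):
    # 1 for x > 0, otherwise 0 (value at x = 0 chosen by convention)
    return (np.asarray(x, dtype=float) > 0).astype(float)

def linear(x):
    # Identity activation used in the output layer
    return np.asarray(x, dtype=float)

def linear_prime(x):
    # Derivative of the identity is 1 everywhere
    return np.ones_like(np.asarray(x, dtype=float))
```

Each derivative can be checked against a central finite difference of the corresponding function, which is a convenient safeguard when implementing back-propagation by hand.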


To train a DANNR model for estimating the CS of concrete mixes containing
manufactured sand, the back-propagation and gradient-descent algorithms are used to adapt the
connection weights between different layers. Herein, the connection weights in each layer are
stored in the form of a matrix W. In the forward pass, the input layer obtains the signals
representing the characteristics of a concrete mix (x) and transfers them through the hidden
layers and the output layer. The output layer yields the estimated CS (y). To reduce the error (ε)
between the observed CS (t) and the predicted one (y), the backward pass is performed. In the
backward pass, ε is transmitted backward to each preceding layer and the network’s weights W
are optimized via the gradient-descent algorithm.
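

The forward pass described above can be sketched as follows. The layer sizes (10 predictors, three hidden layers of 12 neurons, one output) and the batch of 16 samples are illustrative choices, and the bias vectors, the random initialization, and the function name `forward` are assumptions of this sketch rather than details reported in the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases, activation=sigmoid):
    """Propagate a batch x of shape (n_samples, D) through the network.

    Hidden layers apply the nonlinear activation; the output layer is linear,
    as is usual for function approximation.
    """
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = activation(h @ W + b)          # hidden layer signal
    return h @ weights[-1] + biases[-1]    # linear output layer (estimated CS)

# Hypothetical network: 10 inputs -> 12 -> 12 -> 12 -> 1 output
rng = np.random.default_rng(0)
layer_sizes = [10, 12, 12, 12, 1]
weights = [rng.normal(0.0, 0.1, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

x = rng.normal(size=(16, 10))   # one mini-batch of standardized mix descriptors
y = forward(x, weights, biases)
```

Storing one weight matrix per layer mirrors the matrix W mentioned in the text; the backward pass would then compute one gradient matrix of the same shape for each entry of `weights`.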


The back-propagation algorithm requires the calculation of the gradient of a loss function, which
is used to determine the direction and the amount of the update for each weighting value [49].
For regression analysis, the commonly used loss function is the Squared Error Loss (SEL)
[35-50]. This loss function is based on the squared difference between the
actual and predicted CS. The SEL is given by:


$\mathrm{SEL}(t, y) = \frac{1}{2}(t - y)^2$ (1)


Additionally, a common problem faced during the training phase of a DANNR is how to 
alleviate overfitting. This phenomenon usually occurs when a deep learning model performs 
exceptionally well in the model construction phase but the model’s estimations of unseen data 
are highly inaccurate. One effective method for mitigating overfitting is weight regularization. 
This approach prevents overfitting by constraining the magnitude of the network’s weights. To 
do so, additional terms are included in the loss function [49-52]. Generally, there are two forms 
of weight regularization: L1 and L2. In the former case, the L1-norm of a network weight w is 
added to the loss function. In the latter case, the L2-norm of a network weight w is used. 
Accordingly, the modified loss functions are given by: 


L1-norm: $L(t, y) = \mathrm{SEL}(t, y) + \lambda \lVert w \rVert_1$ (2)
L2-norm: $L(t, y) = \mathrm{SEL}(t, y) + \lambda \lVert w \rVert_2$ (3)
where λ is a hyper-parameter of the loss functions. A large λ strongly penalizes large values
of the network weights.
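

As a small worked illustration of Eqs. (1)-(3), the sketch below evaluates the regularized losses for a single sample. It assumes the 1/2 factor in the SEL that is implied by the derivative in Eq. (4), and it penalizes the plain L2-norm as the text states; many software libraries penalize the squared L2-norm instead. The numerical values are hypothetical.

```python
import numpy as np

def sel(t, y):
    # Squared Error Loss with the 1/2 factor (assumption consistent with Eq. (4))
    return 0.5 * (t - y) ** 2

def regularized_loss(t, y, w, lam, kind="L2"):
    """SEL plus a weight penalty scaled by the hyper-parameter lam."""
    w = np.asarray(w, dtype=float)
    if kind == "L1":
        penalty = np.sum(np.abs(w))        # L1-norm of the weights
    elif kind == "L2":
        penalty = np.sqrt(np.sum(w ** 2))  # L2-norm of the weights (not squared)
    else:
        raise ValueError("kind must be 'L1' or 'L2'")
    return sel(t, y) + lam * penalty

# Hypothetical sample: observed CS 50 MPa, predicted CS 48 MPa, two weights
t, y, w, lam = 50.0, 48.0, [3.0, -4.0], 0.001
loss_l1 = regularized_loss(t, y, w, lam, kind="L1")
loss_l2 = regularized_loss(t, y, w, lam, kind="L2")
```

Here the unregularized SEL equals 2.0, the L1 penalty adds 0.001 x 7, and the L2 penalty adds 0.001 x 5, which shows how the regularization term grows with the magnitude of the weights.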


Using the backpropagation framework, the partial derivatives of the loss function with respect to
each connection weight must be specified. Readers are referred to [53]
and [47] for the equations employed for adapting a model’s weights. Herein, we focus on
the equation used to update the connection weights in the output layer. These connection weights
are directly associated with the derivative of the loss function $L(\cdot)$ with respect to the predicted
output variable y. In detail, if the standard loss function $L(\cdot) = \mathrm{SEL}(\cdot)$ is utilized, the partial
derivative of $L(\cdot)$ with respect to the $i$-th connection weight in the output layer $w_i^U$ is presented as
follows:


$\frac{\partial L}{\partial w_i^U} = \frac{\partial L}{\partial y} \times \frac{\partial y}{\partial w_i^U} = -(t - y) \times \frac{\partial y}{\partial w_i^U}$ (4)
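

The chain rule in Eq. (4) can be verified numerically. The sketch below assumes a linear output neuron y = Σ w_i h_i, so that ∂y/∂w_i = h_i, and an SEL of the form (1/2)(t − y)², whose derivative with respect to y is −(t − y). The vector h of last-hidden-layer outputs and all numerical values are hypothetical.

```python
import numpy as np

def sel(t, y):
    # Squared Error Loss with the 1/2 factor (assumption)
    return 0.5 * (t - y) ** 2

def output_layer_gradient(t, h, w):
    """Analytic gradient of the SEL with respect to the output-layer weights."""
    y = h @ w               # linear output neuron
    return -(t - y) * h     # dL/dy * dy/dw_i, with dy/dw_i = h_i

# Hypothetical hidden-layer outputs, output weights, and observed target
h = np.array([0.2, -0.5, 0.9])
w = np.array([0.4, 0.1, -0.3])
t = 1.5

analytic = output_layer_gradient(t, h, w)

# Central finite differences as an independent check
eps = 1e-6
numeric = np.array([
    (sel(t, h @ (w + eps * e)) - sel(t, h @ (w - eps * e))) / (2 * eps)
    for e in np.eye(len(w))
])
```

The two gradient vectors agree to within the finite-difference error, which is the kind of check that is useful before plugging a hand-written gradient into any of the optimizers of Section 2.2.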


2.2. Advanced optimizers for training deep neural networks


2.2.1. Root mean squared propagation (RMSprop)
RMSprop, described in [54], is an improved gradient descent algorithm that uses an adaptive
learning rate. Herein, the gradient at time step t ($g_t = \partial L / \partial w$) is divided by a running average of
its magnitude [49]. This running average at time step t is given by:
$v(t) = \gamma \times v(t-1) + (1 - \gamma) \times g_t^2$ (5)
where $\gamma \in (0, 1)$ and $g_t^2$ denotes the element-wise square of the gradient $g_t$.

The equation used to revise the model’s weights is given by:

$w(t+1) = w(t) - \alpha \times \frac{g_t}{\sqrt{v(t)} + \varepsilon}$ (6)

where α is the learning rate and ε = 1e-8 is a small number that guarantees the numerical stability of
the calculation process.
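

A minimal sketch of one RMSprop update following Eqs. (5) and (6) is given below. The decay rate γ = 0.9 is a commonly used value and is an assumption here, since the paper only states γ ∈ (0, 1); the function name and the toy quadratic loss are likewise illustrative.

```python
import numpy as np

def rmsprop_step(w, grad, v, lr=0.01, gamma=0.9, eps=1e-8):
    """One RMSprop update; returns the new weights and running average."""
    v = gamma * v + (1.0 - gamma) * grad ** 2   # Eq. (5): running average of squared gradient
    w = w - lr * grad / (np.sqrt(v) + eps)      # Eq. (6): scaled gradient step
    return w, v

# Toy problem: minimize L(w) = 0.5 * ||w||^2, whose gradient is simply w
w0 = np.array([5.0, -3.0])
w, v = w0.copy(), np.zeros_like(w0)
w_first, v_first = rmsprop_step(w, w, v)        # first update from the start point
for _ in range(200):
    w, v = rmsprop_step(w, w, v)
```

Because the gradient is divided by the square root of its own running average, the step size of each weight is roughly independent of the raw gradient scale, which is the adaptive behaviour described in the text.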


2.2.2. Adaptive moment estimation (Adam)


The Adam optimizer, presented in [55], utilizes estimates of the first and second moments of
the gradient via exponential moving averages and bias corrections. This algorithm also employs
an exponentially decaying average of past gradients [56]. To update the network’s weights, it is
first required to compute the first biased moment estimate as follows:


$m_t = \beta_1 \times m_{t-1} + (1 - \beta_1) \times g_t$ (7)
where $\beta_1 = 0.9$.

The second biased moment estimate is given by:

$v_t = \beta_2 \times v_{t-1} + (1 - \beta_2) \times g_t^2$ (8)


where $\beta_2 = 0.9999$ is a hyper-parameter of the algorithm.


The bias-corrected first moment estimate is computed in the following manner:

$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}$ (9)

The bias-corrected second moment estimate is obtained via:

$\hat{v}_t = \frac{v_t}{1 - \beta_2^t}$ (10)

Accordingly, the optimized parameters of a neural computing model are adapted via

$w_t = w_{t-1} - \alpha \times \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \varepsilon}$ (11)
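

The Adam update of Eqs. (7)-(11) can be sketched as follows, using the hyper-parameter values quoted above (β1 = 0.9, β2 = 0.9999) and ε = 1e-8 as in the other optimizers. The learning rate, the function name, and the toy quadratic loss are assumptions made for illustration only.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.9999, eps=1e-8):
    """One Adam update at time step t (t starts at 1)."""
    m = beta1 * m + (1.0 - beta1) * grad          # Eq. (7): first biased moment
    v = beta2 * v + (1.0 - beta2) * grad ** 2     # Eq. (8): second biased moment
    m_hat = m / (1.0 - beta1 ** t)                # Eq. (9): bias correction
    v_hat = v / (1.0 - beta2 ** t)                # Eq. (10): bias correction
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)   # Eq. (11): weight update
    return w, m, v

# Toy problem: minimize L(w) = 0.5 * ||w||^2, whose gradient is simply w
w0 = np.array([5.0, -3.0])
w, m, v = w0.copy(), np.zeros(2), np.zeros(2)
w_first, _, _ = adam_step(w, w, m, v, t=1)
for t in range(1, 101):
    w, m, v = adam_step(w, w, m, v, t)
```

At the first step the bias corrections cancel the (1 − β) factors, so each weight moves by approximately the learning rate in the direction opposite to its gradient, regardless of the gradient magnitude.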


2.2.3. Nesterov-accelerated adaptive moment estimation (Nadam)


The Nadam optimizer [57] combines the Nesterov accelerated gradient and the Adam optimizer. Herein,
Nesterov momentum is used to consider the gradient at the projected future position [56].
Therefore, the Nadam optimizer can search effectively in regions of
the loss function where the gradient is flat. The equation used to update the network’s weights
according to the Nadam algorithm is given by:


$w_t = w_{t-1} - \alpha \times \frac{\hat{m}_t}{\sqrt{\hat{n}_t} + \varepsilon}$ (12)
where α is also the learning rate parameter and ε = 1e-8.
The revised first moment estimate $\hat{m}_t$ is given by:


$\hat{m}_t = \left(\mu \times \frac{m_t}{1 - \prod_{i=1}^{t+1} \mu}\right) + \left((1 - \mu) \times \frac{g_t}{1 - \prod_{i=1}^{t+1} \mu}\right)$ (13)


where $m_t = (1 - \mu) \times g_t + \mu \times m_{t-1}$ and μ = 0.975 denotes a hyper-parameter.


The second biased moment estimate $n_t$ and the corrected moment estimate $\hat{n}_t$ are given by:


$n_t = \nu \times n_{t-1} + (1 - \nu) \times g_t^2$ (14)
$\hat{n}_t = \frac{n_t}{1 - \nu^t}$ (15)


where ν = 0.999 is a hyper-parameter of the Nadam algorithm.
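

For illustration, a Nadam update following Eqs. (12)-(15) as printed in this section might be sketched as below, with a constant μ = 0.975 and ν = 0.999. This sketch assumes that both products in Eq. (13) run up to t + 1, so that each equals μ^(t+1); Dozat's original formulation corrects the gradient term with the product up to t instead, so the code should be read as one plausible interpretation rather than a reference implementation. The learning rate and toy loss are assumptions.

```python
import numpy as np

def nadam_step(w, grad, m, n, t, lr=0.01, mu=0.975, nu=0.999, eps=1e-8):
    """One Nadam update at time step t (t starts at 1), with constant mu."""
    m = mu * m + (1.0 - mu) * grad                 # first biased moment m_t
    n = nu * n + (1.0 - nu) * grad ** 2            # Eq. (14): second biased moment
    prod = mu ** (t + 1)                           # product of a constant mu over i = 1..t+1
    m_hat = mu * m / (1.0 - prod) + (1.0 - mu) * grad / (1.0 - prod)   # Eq. (13)
    n_hat = n / (1.0 - nu ** t)                    # Eq. (15): bias correction
    w = w - lr * m_hat / (np.sqrt(n_hat) + eps)    # Eq. (12): weight update
    return w, m, n

# Toy problem: minimize L(w) = 0.5 * ||w||^2, whose gradient is simply w
w0 = np.array([5.0, -3.0])
w, m, n = w0.copy(), np.zeros(2), np.zeros(2)
w_first, _, _ = nadam_step(w, w, m, n, t=1)
for t in range(1, 101):
    w, m, n = nadam_step(w, w, m, n, t)
```

The only structural difference from the Adam sketch is the look-ahead combination of the momentum term and the current gradient in the first moment estimate.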


2.3. The collected dataset 


The dataset consisting of testing results of manufactured-sand concrete samples has been 
collected and compiled in the previous works of [58] and [59]. The aforementioned works 
carried out experimental studies on the development of CS of concrete containing manufactured 
sand. There are 132 testing records that provide the concrete mixes’ constituents and the CS 
corresponding to different curing ages. The input factors of stone powder content, sand ratio, 
quantity of cement, quantity of water, quantity of coarse aggregate, quantity of water-reducer, 
quantity of manufactured sand, concrete slump, unit weight of concrete, and curing age are used 
as independent variables to estimate the CS as a dependent variable. Herein, the manufactured
sand is obtained from crushed limestone with the particle size of 0-4.75 mm. Table 2 provides 
the detailed information on the CS and its predictor variables. 


Table 2
Statistical description of the collected dataset.

Variables  Note                                      Min        Average    Std.      Skewness  Max
X1         Stone powder content (%)                  5.000      9.000      3.278     0.000     13.000
X2         Sand ratio (%)                            34.000     37.636     3.999     0.883     44.000
X3         Quantity of cement (kg/m³)                321.000    397.182    52.638    -0.324    462.000
X4         Quantity of water (kg/m³)                 180.000    181.364    2.235     1.032     185.000
X5         Quantity of coarse aggregate (kg/m³)      1091.000   1166.182   46.306    -1.019    1197.000
X6         Quantity of water-reducer (kg/m³)         2.247      2.997      0.358     -0.122    3.696
X7         Quantity of manufactured sand (kg/m³)     613.000    707.091    96.040    0.799     858.000
X8         Concrete slump (mm)                       30.000     75.909     42.506    1.035     160.000
X9         Unit weight of concrete (kg/m³)           2410.000   2443.758   16.647    -0.889    2463.000
X10        Curing age (day)                          3.000      132.159    120.968   0.703     388.000
Y          Compressive strength (CS) (MPa)           28.500     55.840     11.793    -0.250    78.200


2.4. The metrics used for performance measurement 


To evaluate the performance of the deep learning models used in this paper, a set of three
indicators is considered: the coefficient of determination (R²), root mean square error
(RMSE), and mean absolute percentage error (MAPE). These indicators are widely used for
assessing the predictive capability of regression models [45-60]. The equations
used to compute these three indicators are presented in Table 3. It is worth noting that the
closer R² is to 1, the better the prediction outcome. In addition, small values of RMSE and
MAPE reflect low prediction errors. R² is unitless and MAPE is expressed in percent. Meanwhile, the unit of the
RMSE is MPa.


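Table 3
The employed performance indicators.

Indices  Equation                                                                           Range     Ideal value
R²       $R^2 = 1 - \frac{\sum_{i=1}^{N}(t_i - y_i)^2}{\sum_{i=1}^{N}(t_i - \bar{t})^2}$    (0, 1)    1
RMSE     $\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - t_i)^2}$                    (0, +∞)   0
MAPE     $\mathrm{MAPE} = \frac{100}{N}\sum_{i=1}^{N}\frac{|y_i - t_i|}{t_i}$               (0, +∞)   0

Note: $t_i$ and $y_i$ are the observed and predicted CS values of the i-th data instance, $\bar{t}$ is the mean
of the observed CS values, and N denotes the number of data records.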
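

The three indicators can be computed with a few NumPy expressions, as sketched below. The function names and the four sample values are hypothetical and are not taken from the collected dataset; MAPE is computed relative to the observed values, which is the usual convention and is assumed here.

```python
import numpy as np

def r_squared(t, y):
    # Coefficient of determination: 1 - SSE / SST
    t, y = np.asarray(t, dtype=float), np.asarray(y, dtype=float)
    return 1.0 - np.sum((t - y) ** 2) / np.sum((t - t.mean()) ** 2)

def rmse(t, y):
    # Root mean square error, in the unit of the target (MPa for CS)
    t, y = np.asarray(t, dtype=float), np.asarray(y, dtype=float)
    return np.sqrt(np.mean((y - t) ** 2))

def mape(t, y):
    # Mean absolute percentage error, relative to the observed values
    t, y = np.asarray(t, dtype=float), np.asarray(y, dtype=float)
    return 100.0 * np.mean(np.abs(y - t) / t)

# Hypothetical observed and predicted CS values (MPa)
t_obs = [30.0, 40.0, 50.0, 60.0]
y_pred = [32.0, 39.0, 51.0, 57.0]
scores = {"R2": r_squared(t_obs, y_pred),
          "RMSE": rmse(t_obs, y_pred),
          "MAPE": mape(t_obs, y_pred)}
```

Reporting all three indicators together is useful because RMSE is sensitive to large individual errors, whereas MAPE expresses the error relative to the strength level of each sample.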


3. Experimental results and discussion


This part of the paper presents the experimental outcomes of the DANNR used for predicting the
CS of manufactured-sand concrete. The deep learning models use three activation functions in
the hidden layers: the logistic sigmoid (Sigmoid), the hyperbolic tangent sigmoid (Tanh), and
the rectified linear unit activation (ReLU). The state-of-the-art RMSprop, Adam, and Nadam
optimizers are used to adapt the deep networks with respect to the collected dataset. Hence,
nine DANNR models (RMSprop-Sigmoid-DANNR, RMSprop-Tanh-DANNR, RMSprop-ReLU-DANNR,
Adam-Sigmoid-DANNR, Adam-Tanh-DANNR, Adam-ReLU-DANNR, Nadam-Sigmoid-DANNR,
Nadam-Tanh-DANNR, and Nadam-ReLU-DANNR) are constructed and used for
result comparison. It is noted that the employed deep learning models have been developed in
the MATLAB programming environment. In addition, the computational experiments in this study
are performed on a desktop computer with an Intel(R) Core(TM) i7-10700F CPU @
2.90GHz and 16GB RAM.


As mentioned earlier, the dataset includes 132 records and ten predictor variables. These
predictor variables provide information on the concrete mix and curing age with respect to the
output variable of the CS. In this study, to standardize the ranges of the predictor and predicted
variables, the Z-score normalization equation is used. Thus, the original variables are normalized
as follows:

$X_N = \frac{X_O - \mu_X}{\sigma_X}$ (16)

where $X_N$ and $X_O$ denote the standardized and the original variables, respectively; $\mu_X$ and
$\sigma_X$ are the mean and standard deviation of the original variable.
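

A short sketch of the Z-score normalization in Eq. (16) is given below. The two-column array holds hypothetical cement and water quantities, and the use of the sample standard deviation (ddof=1) is an assumption, since the paper does not state which estimator is used.

```python
import numpy as np

def zscore_fit(X):
    # Column-wise mean and sample standard deviation (ddof=1 is an assumption)
    X = np.asarray(X, dtype=float)
    return X.mean(axis=0), X.std(axis=0, ddof=1)

def zscore_apply(X, mu, sigma):
    # Eq. (16): X_N = (X_O - mu_X) / sigma_X
    return (np.asarray(X, dtype=float) - mu) / sigma

def zscore_invert(X_norm, mu, sigma):
    # Map standardized values back to the original scale (e.g. predicted CS in MPa)
    return np.asarray(X_norm, dtype=float) * sigma + mu

# Hypothetical records: [quantity of cement, quantity of water] in kg/m^3
X_raw = np.array([[321.0, 185.0],
                  [380.0, 182.0],
                  [420.0, 180.0],
                  [462.0, 180.0]])
mu, sigma = zscore_fit(X_raw)
X_norm = zscore_apply(X_raw, mu, sigma)
```

In practice the mean and standard deviation would be estimated on the training portion only and then reused for the testing portion, and the inverse mapping would be applied to the network output before computing RMSE in MPa.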


The aforementioned deep learning models are trained by the stochastic gradient descent method
with the three optimizers (RMSprop, Adam, and Nadam). The batch size used in the stochastic
gradient descent method is 16. In addition, the deep learning models have been trained for
500 epochs. Furthermore, the DANNR models require the setting of their hyper-parameters,
including the learning rate, the regularization type (L1 or L2), the regularization parameter, the
number of hidden layers, and the number of neurons in each hidden layer. This study has
carried out a five-fold cross validation process [61] to identify suitable settings of the DANNR
models. Based on this cross validation process, the suitable learning rate and regularization
parameter are 0.01 and 0.001, respectively. In addition, the configurations of the DANNR
models are summarized in Table 4.
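

The data partitioning behind a five-fold cross validation can be sketched as follows. The generator name, the random seed, and the use of a simple random permutation are assumptions of this sketch; the model fitting and scoring that would be run on each fold for every candidate hyper-parameter setting are omitted.

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross validation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)      # shuffle the record indices once
    folds = np.array_split(perm, k)        # k nearly equal folds
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, val_idx

# With 132 records, each validation fold holds 26 or 27 samples
fold_sizes = [(len(tr), len(va)) for tr, va in k_fold_indices(132, k=5)]
```

Each candidate configuration (for example, a given number of hidden layers and neurons) would be trained on the four training folds and scored on the held-out fold, and the configuration with the best average validation score would be retained.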


Table 4
Configurations of the DANNR models.

Models                   Regularization type   Number of hidden layers   Number of neurons in each hidden layer
RMSprop-Sigmoid-DANNR    L2                    3                         12
RMSprop-Tanh-DANNR       L1                    9                         10
RMSprop-ReLU-DANNR       L2                    2                         10
Adam-Sigmoid-DANNR       L2                    4                         12
Adam-Tanh-DANNR          L2                    ?                         8
Adam-ReLU-DANNR          L1                    2                         8
Nadam-Sigmoid-DANNR      L2                    3                         12
Nadam-Tanh-DANNR         L1                    2                         10
Nadam-ReLU-DANNR         L1                    2                         12


Table 5
Prediction results of the DANNR models.

Optimizer  Phase     Indices    Sigmoid-DANNR     Tanh-DANNR        ReLU-DANNR
                                Mean     Std      Mean     Std      Mean     Std
RMSprop    Training  RMSE       1.360    0.115    1.427    0.105    1.677    0.262
                     MAPE (%)   2.009    0.195    2.077    0.184    2.409    0.363
                     R²         0.986    0.002    0.985    0.002    0.979    0.007
           Testing   RMSE       2.399    0.571    3.230    1.502    2.944    0.764
                     MAPE (%)   3.720    1.056    4.060    1.605    4.580    1.658
                     R²         0.955    0.019    0.903    0.090    0.929    0.037
Adam       Training  RMSE       1.255    0.098    1.331    0.149    2.444    0.726
                     MAPE (%)   1.863    0.149    1.951    0.234    3.527    1.052
                     R²         0.989    0.002    0.987    0.003    0.953    0.028
           Testing   RMSE       2.098    0.540    2.881    1.167    3.203    1.035
                     MAPE (%)   3.105    0.763    4.193    1.685    4.798    1.999
                     R²         0.958    0.032    0.921    0.085    0.912    0.045
Nadam      Training  RMSE       1.364    0.075    1.452    0.130    2.320    0.393
                     MAPE (%)   2.011    0.140    2.152    0.227    3.382    0.573
                     R²         0.986    0.001    0.985    0.003    0.960    0.013
           Testing   RMSE       1.952    0.683    2.698    0.773    3.847    1.128
                     MAPE (%)   3.043    1.261    3.878    1.178    5.825    1.826
                     R²         0.970    0.018    0.922    0.050    0.861    0.095


Using the configurations identified by the cross validation processes, a repeated sampling of the
collected data, in which 90% of the dataset is used for model training and 10% of the dataset is
used for model testing, is carried out 20 times. This repeated sampling process aims at mitigating
the bias in model evaluation due to the randomness in data selection. The prediction results of the
nine DANNR models used for predicting the CS of manufactured-sand concrete are reported in
Table 5. Overall, all deep learning models yield accurate and reliable predictions of the CS, as
shown by low values of RMSE and MAPE as well as high values of R². Nevertheless, the
DANNR using the logistic sigmoid function and trained by the Nadam optimizer outperforms
the other benchmark approaches. The Nadam-Sigmoid-DANNR yields the highest predictive
accuracy with the average RMSE = 1.952, MAPE = 3.043%, and R² = 0.97. Notably, R² is the
proportion of the variation in the CS that can be explained by the DANNR using the set of
ten predictor variables. This means that 97% of the total variation in the CS of
manufactured-sand concrete can be explained by the deep learning model.
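

The repeated 90/10 sampling scheme can be sketched as below. The helper names, the seed, and the rounding of the test-set size (13 of 132 records) are assumptions of this sketch; the per-run testing scores would come from training and evaluating a DANNR model on each split, which is not shown.

```python
import numpy as np

def repeated_holdout(n_samples, n_runs=20, test_fraction=0.1, seed=0):
    """Yield (train_idx, test_idx) for repeated random 90/10 sampling."""
    rng = np.random.default_rng(seed)
    n_test = int(round(test_fraction * n_samples))   # 13 of 132 records (assumed rounding)
    for _ in range(n_runs):
        perm = rng.permutation(n_samples)            # fresh random ordering per run
        yield perm[n_test:], perm[:n_test]

def summarize(scores):
    # Mean and sample standard deviation over the independent runs
    scores = np.asarray(scores, dtype=float)
    return scores.mean(), scores.std(ddof=1)

runs = list(repeated_holdout(132))
```

Collecting one testing RMSE, MAPE, and R² value per run and passing each list to `summarize` would give mean and standard deviation columns of the kind reported in Table 5.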


The Adam-Sigmoid-DANNR (with RMSE = 2.098, MAPE = 3.105%, R² = 0.958) and
RMSprop-Sigmoid-DANNR (with RMSE = 2.399, MAPE = 3.725%, R² = 0.955) are the second
and third best models, respectively. This indicates that a DANNR used with the logistic
sigmoid activation function is highly suitable for the dataset at hand. The DANNR with the
ReLU activation function (RMSE = 2.944) is slightly better than the one with the Tanh function
(RMSE = 3.230) when RMSprop is used. However, when the DANNR models are trained by
the Adam and Nadam algorithms, the models using the Tanh function outperform the ones
using the ReLU function.


Table 6
The computational (com.) time of the DANNR models.

DANNR models        Average com. time (s)
RMSprop-Sigmoid     2.19
RMSprop-Tanh        1.60
RMSprop-ReLU        1.59
Adam-Sigmoid        2.51
Adam-Tanh           1.55
Adam-ReLU           1.58
Nadam-Sigmoid       2.05
Nadam-Tanh          1.62
Nadam-ReLU          1.66

Moreover, the average computational time of the DANNR models is reported in Table 6. In
general, the computational times of the DANNR models using the Sigmoid function are higher
than those of the other models. The training progress of the deep learning models is shown
in Fig. 2. The convergence rates of the deep learning models using the Sigmoid and Tanh
functions are faster than those of the models employing the ReLU function. Therefore, it can be
observed that the Sigmoid and Tanh functions are more suitable for modeling the current dataset
than the ReLU function. Compared to the ReLU function, the Sigmoid and Tanh activation
functions can help attain better prediction accuracy with a slight increase in computational
expense.


The boxplots illustrating the prediction performances of the DANNR models after 20
independent runs are shown in Fig. 3. Based on the boxplots, the RMSprop-Tanh-DANNR,
Adam-ReLU-DANNR, and Nadam-ReLU-DANNR demonstrate relatively unstable
performances; their ranges between the minimum RMSE and maximum RMSE are considerably
wider than those of the other models. Additionally, the median (shown as a red line) of the
Nadam-Sigmoid-DANNR is the lowest among all of the employed models.


Fig. 2. Training progress of the DANNR models.


Fig. 3. Boxplots of the models’ performance obtained from 20 independent runs: (a) RMSprop optimizer,
(b) Adam optimizer, and (c) Nadam optimizer.


As mentioned earlier, the Nadam-Sigmoid-DANNR has obtained a good prediction performance
with RMSE = 1.952, MAPE = 3.043%, and R² = 0.97. This result can be benchmarked against
machine learning models previously used for estimating the CS of manufactured-sand concrete.
In [19], the Adaptive Neuro Fuzzy Inference System (ANFIS) and a feedforward Artificial Neural
Network (ANN) have been employed. ANFIS and ANN attain RMSE = 6.46 and 7.67,
respectively. In addition, Ly et al. [19] also enhance the ANFIS model by using
Teaching-Learning-Based Optimization (TLBO); the hybrid ANFIS-TLBO yields a better data fit with
RMSE = 4.93. The model proposed by Zhang et al. [43] combines the gradient boosted regression
tree (GBRT) and the Firefly algorithm (FA); the latter algorithm is employed to optimize the
tuning parameters of the former. The model has yielded RMSE = 3.346 [43]. Based on the
results reported in the previous studies, it can be seen that the Nadam-Sigmoid-DANNR
proposed in this study provides a promising prediction performance in estimating the CS of
manufactured-sand concrete.


Fig. 4 and Fig. 5 demonstrate the goodness of fit obtained by the Nadam-Sigmoid-DANNR
model. Although the proposed method has attained a high degree of fit, its results show certain
deviations from the actual CS. The absolute deviations (or residuals) of the proposed deep
learning model are shown in Fig. 6. The histogram of the model’s residuals is presented in
Fig. 7. The maximum, minimum, and average residuals are 8.4540 MPa, 0.001 MPa, and 1.4321
MPa, respectively. This discrepancy between the estimated and observed CS is understandable,
because the prediction of CS is highly complex due to the nonlinear and multivariate nature of
the estimation task [15]. Moreover, a certain degree of uncertainty always exists in the
experimental and testing processes used to measure the CS value of a concrete mix.


Fig. 4. Correlation between actual output and predicted output obtained by the Nadam-Sigmoid-DANNR
model (line of best fit: R² = 0.970).


Fig. 7. Histogram of the Nadam-Sigmoid-DANNR model.


4. Conclusion


Prediction of the CS of manufactured-sand concrete based on its constituents and curing age is 
crucial for concrete mix design. In addition, the development of the CS at various ages is a 
complex phenomenon that involves the interplay of multiple predictor variables. Although 
various machine learning methods have been put forward to construct data-driven tools for 
concrete strength estimation, few studies have investigated the capabilities of deep neural 
network regression models in the task of interest. 


This paper has proposed and verified a deep learning-based solution for achieving accurate 
estimations of the CS of manufactured-sand concrete. The DANNR models are trained with the 
advanced RMSprop, Adam, and Nadam optimizers. The research findings show that the
Nadam-optimized DANNR with the Sigmoid activation function can help achieve the most accurate
predictions of the CS with RMSE = 1.952, MAPE = 3.043%, and R² = 0.97. Therefore, the
Nadam-Sigmoid-DANNR model is recommended for practical purposes because it can help to 
mitigate the time and cost dedicated to laboratory work. 